PATENT
Docket No. GC 381 

PROTEASES FROM GRAM-POSITIVE ORGANISMS 

FIELD OF THE INVENTION 

The present invention relates to cysteine proteases derived from gram-positive
microorganisms. The present invention provides nucleic acid and amino acid
sequences of cysteine protease 1, 2 and 3 identified in Bacillus. The present invention
also provides methods for the production of cysteine protease 1, 2 and 3 in host cells
as well as the production of heterologous proteins in a host cell having a mutation or
deletion of part or all of at least one of the cysteine proteases of the present invention.

BACKGROUND OF THE INVENTION

Gram-positive microorganisms, such as members of the group Bacillus, have
been used for large-scale industrial fermentation due, in part, to their ability to secrete
their fermentation products into the culture media. In gram-positive bacteria, secreted
proteins are exported across a cell membrane and a cell wall, and then are
subsequently released into the external media, usually maintaining their native
conformation.

Various gram-positive microorganisms are known to secrete extracellular
and/or intracellular protease at some stage in their life cycles. Many proteases are
produced in large quantities for industrial purposes. A negative aspect of the presence
of proteases in gram-positive organisms is their contribution to the overall degradation
of secreted heterologous or foreign proteins.

The classification of proteases found in microorganisms is based on their
catalytic mechanism, which results in four groups: the serine proteases;
metalloproteases; cysteine proteases; and aspartic proteases. These categories can be
distinguished by their sensitivity to various inhibitors. For example, the serine
proteases are inhibited by phenylmethylsulfonylfluoride (PMSF) and
diisopropylfluorophosphate (DIFP); the metalloproteases by chelating agents; the
cysteine enzymes by iodoacetamide and heavy metals; and the aspartic proteases by
pepstatin. The serine proteases have alkaline pH optima, the metalloproteases are
optimally active around neutrality, and the cysteine and aspartic enzymes have acidic
pH optima (Biotechnology Handbooks, Bacillus, vol. 2, edited by Harwood, 1989
Plenum Press, New York).

The activity of cysteine protease depends on a catalytic dyad of cysteine and
histidine, with the order differing between families. The best known family of
cysteine proteases is that of papain, having catalytic residues Cys-25 and His-159.
Cysteine proteases of the papain family catalyze the hydrolysis of peptide, amide,
ester, thiol ester and thiono ester bonds. Naturally occurring inhibitors of cysteine
proteases of the papain family are those of the cystatin family (Methods in
Enzymology, vol. 244, Academic Press, Inc. 1994).

SUMMARY OF THE INVENTION

The present invention relates to the unexpected and surprising discovery of
three heretofore unknown or unrecognized cysteine proteases found in Bacillus
subtilis, designated herein as CP1, CP2 and CP3, having the nucleic acid and amino
acid sequences shown in Figures 1A-1B, Figures 5A-5B and 6A-6B, respectively. The
present invention is based, in part, upon the presence of the characteristic cysteine
protease amino acid motif GXCWAF found in uncharacterised translated genomic nucleic
acid sequences of Bacillus subtilis. The present invention is also based in part upon the
structural relatedness that CP1 has with the cysteine protease papain, specifically with
respect to the location of the catalytic histidine/alanine and asparagine/serine residues,
and the structural relatedness that CP1 has with CP2 and CP3.
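
As an illustration of how the GXCWAF motif could be located in a translated
sequence, the following is a minimal sketch in Python. It assumes that X stands for
any single residue; the function name and the example peptide are made up for this
sketch and are not sequences from the Figures.

```python
import re

# GXCWAF as given in the text; reading X as "any one residue" is an assumption.
MOTIF = re.compile(r"G[A-Z]CWAF")


def find_motif(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched text) for each GXCWAF occurrence."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(protein.upper())]


# Made-up peptide for illustration only; not a sequence from the Figures.
example = "MKTLLGSCWAFSAVNH"
hits = find_motif(example)
```

A hit reported by such a scan would only flag a candidate; the text relies in
addition on alignment with papain and on the positions of the catalytic residues.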

The present invention provides isolated polynucleotide and amino acid
sequences for CP1, CP2 and CP3. Due to the degeneracy of the genetic code, the
present invention encompasses any nucleic acid sequence that encodes the CP1, CP2
and CP3 amino acid sequence shown in the Figures.
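
To give a sense of scale for the degeneracy mentioned above, the following Python
sketch counts how many distinct nucleotide sequences encode a given peptide. It
assumes the standard genetic code; the function name and the example peptides are
illustrative only.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons available for each residue.
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences for the peptide (no stop codon)."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


# Illustrative hexapeptide matching the GXCWAF pattern with X = S.
n = count_encodings("GSCWAF")
```

Even a six-residue stretch has several hundred possible encodings here, so the
number for a full-length protease is very large.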

The present invention encompasses amino acid variations of the B. subtilis CP1,
CP2 and CP3 amino acid sequences disclosed herein that have proteolytic activity.
B. subtilis CP1, CP2 and CP3, as well as proteolytically active amino acid variations
thereof, have application in cleaning compositions. The present invention also
encompasses amino acid variations or derivatives of CP1, CP2 and CP3 that do not have
the characteristic proteolytic activity, as long as the nucleic acid sequences encoding
such variations or derivatives would have sufficient 5' and 3' coding regions to be
capable of being integrated into a gram-positive organism genome. Such variants would
have applications in gram-positive expression systems where it is desirable to delete,
mutate, alter or otherwise incapacitate the naturally occurring cysteine protease in
order to diminish or delete its proteolytic activity. Such an expression system would
have the advantage of allowing for greater yields of recombinant heterologous
proteins or polypeptides.

The present invention provides methods for detecting gram-positive
microorganism homologs of B. subtilis CP1, CP2 and CP3 that comprise hybridizing
part or all of the nucleic acid encoding B. subtilis CP1, CP2 and CP3 with nucleic acid
derived from gram-positive organisms, either of genomic or cDNA origin. In one
embodiment, the gram-positive microorganism is selected from the group consisting
of B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B.
amyloliquefaciens, B. coagulans, B. circulans, B. lautus and Bacillus thuringiensis.

In yet another aspect, the present invention provides a gram-positive
microorganism having a mutation or deletion of part or all of the gene encoding CP1
and/or CP2 and/or CP3, which results in the inactivation of the CP1 and/or CP2
and/or CP3 proteolytic activity, either alone or in combination with mutations in other
proteases, such as apr, npr, epr, mpr for example, or other proteases known to those of
skill in the art. In one embodiment of the present invention, the gram-positive
organism is a member of the genus Bacillus. In another embodiment, the Bacillus is
Bacillus subtilis.

The production of desired heterologous proteins or polypeptides in
gram-positive microorganisms may be hindered by the presence of one or more proteases
which degrade the produced heterologous protein or polypeptide. One advantage of
the present invention is that it provides methods and expression systems which can be
used to prevent that degradation, thereby enhancing yields of the desired heterologous
protein or polypeptide. In another aspect, the gram-positive host having one or more
cysteine protease deletions is further genetically engineered to produce a desired
protein.

In one embodiment of the present invention, the desired protein is
heterologous to the gram-positive host cell. In another embodiment, the desired
protein is homologous to the host cell. The present invention encompasses a
gram-positive host cell having a deletion or interruption of the nucleic acid encoding the
naturally occurring homologous protein, such as a protease, and having nucleic acid
encoding the homologous protein re-introduced in a recombinant form. In another
embodiment, the host cell produces the homologous protein. Accordingly, the present
invention also provides methods and expression systems for reducing degradation of
heterologous proteins produced in gram-positive microorganisms. The gram-positive
microorganism may be normally sporulating or non-sporulating.

In a further aspect of the present invention, gram-positive CP1, CP2 or CP3 is
produced on an industrial fermentation scale in a microbial host expression system. In
another aspect, isolated and purified recombinant CP1, CP2 or CP3 is used in
compositions of matter intended for cleaning purposes, such as detergents.
Accordingly, the present invention provides a cleaning composition comprising one or
more of a gram-positive cysteine protease selected from the group consisting of CP1,
CP2 and CP3. The cysteine protease may be used alone or in combination with other
enzymes and/or mediators or enhancers.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A-1B shows the DNA and amino acid sequence for CP1 (YJDE).

Figure 2 shows an amino acid alignment of papain (accession number
papa_carpa.p) with the cysteine protease CP1, designated YJDE. For Figures 2, 3 and
4, the motif GXCWAF has been marked along with the catalytic cysteine and the
conserved catalytic histidine/alanine and asparagine/serine residues.

Figure 3 shows the amino acid alignment of CP1 (YJDE) with CP3 (PMI).

Figure 4 shows the amino acid alignment of CP1 (YJDE) with CP2 (YdhS).

Figure 5A-5B shows the amino acid and nucleic acid sequence for CP2 (YdhS).

Figure 6A-6B shows the amino acid and nucleic acid sequence for CP3 (PMI).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Definitions

As used herein, the genus Bacillus includes all members known to those of
skill in the art, including but not limited to B. subtilis, B. licheniformis, B. lentus, B.
brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. coagulans, B.
circulans, B. lautus and B. thuringiensis.

The present invention encompasses novel CP1, CP2 and CP3 from gram-positive
organisms. In a preferred embodiment, the gram-positive organism is a
Bacillus. In another preferred embodiment, the gram-positive organism is Bacillus
subtilis. As used herein, "B. subtilis CP1, CP2 or CP3" refers to the amino acid
sequences shown in the Figures. Figures 1A-1B show the amino acid and nucleic acid
sequence for CP1 (YJDE); Figures 5A-5B show the amino acid and nucleic acid
sequence for CP2 (YdhS); and Figures 6A-6B show the amino acid and nucleic acid
sequences for CP3 (PMI). The present invention encompasses amino acid variations
of the amino acid sequences disclosed in Figures 1A-1B, 5A-5B and 6A-6B that
have proteolytic activity. Such proteolytic amino acid variants can be used in
cleaning compositions. The present invention also encompasses B. subtilis amino
acid variations or derivatives that are not proteolytically active. DNA encoding such
variants can be used in methods designed to delete or mutate the naturally occurring
host cell CP1, CP2 or CP3.

As used herein, "nucleic acid" refers to a nucleotide or polynucleotide
sequence, and fragments or portions thereof, and to DNA or RNA of genomic or
synthetic origin which may be double-stranded or single-stranded, whether
representing the sense or antisense strand. As used herein, "amino acid" refers to
peptide or protein sequences or portions thereof. A "polynucleotide homolog" as
used herein refers to a gram-positive microorganism polynucleotide that has at least
80%, at least 90% and at least 95% identity to B. subtilis CP1, CP2 or CP3, or which is
capable of hybridizing to B. subtilis CP1, CP2 or CP3 under conditions of high
stringency and which encodes an amino acid sequence having cysteine protease
activity.
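
The text does not say how percent identity is to be computed. As one hedged
illustration, the Python sketch below scores two already-aligned sequences over the
columns where neither has a gap, then reports which of the 80%, 90% and 95%
thresholds are met. The gap convention, the function names and the test strings are
assumptions of this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue ("-" marks a gap).
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def identity_tiers(pct: float) -> list[int]:
    """Thresholds from the homolog definition (80, 90, 95) that pct satisfies."""
    return [t for t in (80, 90, 95) if pct >= t]


# Made-up aligned strings for illustration only.
pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
tiers = identity_tiers(pct)
```

Other conventions, such as dividing by the full alignment length or by the shorter
sequence, give different values for the same alignment.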

The terms "isolated" or "purified" as used herein refer to a nucleic acid or
amino acid that is removed from at least one component with which it is naturally
associated.

As used herein, the term "heterologous protein" refers to a protein or
polypeptide that does not naturally occur in a gram-positive host cell. Examples of
heterologous proteins include enzymes such as hydrolases including proteases,
cellulases, amylases, carbohydrases, and lipases; isomerases such as racemases,
epimerases, tautomerases, or mutases; transferases, kinases and phosphatases. The
heterologous gene may encode therapeutically significant proteins or peptides, such as
growth factors, cytokines, ligands, receptors and inhibitors, as well as vaccines and
antibodies. The gene may encode commercially important industrial proteins or
peptides, such as proteases, carbohydrases such as amylases and glucoamylases,
cellulases, oxidases and lipases. The gene of interest may be a naturally occurring
gene, a mutated gene or a synthetic gene.

The term "homologous protein" refers to a protein or polypeptide native or
naturally occurring in a gram-positive host cell. The invention includes host cells
producing the homologous protein via recombinant DNA technology. The present
invention encompasses a gram-positive host cell having a deletion or interruption of
the nucleic acid encoding the naturally occurring homologous protein, such as a
protease, and having nucleic acid encoding the homologous protein re-introduced in a
recombinant form. In another embodiment, the host cell produces the homologous
protein.

As used herein, the term "overexpressing" when referring to the production of
a protein in a host cell means that the protein is produced in greater amounts than its
production in its naturally occurring environment.

As used herein, the phrase "proteolytic activity" refers to a protein that is able
to hydrolyze a peptide bond. Enzymes having proteolytic activity are described in
Enzyme Nomenclature, 1992, edited Webb, Academic Press, Inc.

Detailed Description of the Preferred Embodiments

The unexpected discovery of the cysteine proteases CP1, CP2 and CP3 in
B. subtilis provides a basis for producing host cells, expression methods and systems
which can be used to prevent the degradation of recombinantly produced heterologous
proteins. In a preferred embodiment, the host cell is a gram-positive host cell that has
a deletion or mutation in the naturally occurring cysteine protease, said mutation
resulting in deletion or inactivation of the production by the host cell of the
proteolytic cysteine protease gene product. The host cell may additionally be
genetically engineered to produce a desired protein or polypeptide.

It may also be desired to genetically engineer host cells of any type to produce
a gram-positive cysteine protease. Such host cells are used in large scale fermentation
to produce large quantities of the cysteine protease, which may be isolated or purified
and used in cleaning products, such as detergents.

I. Cysteine Protease Sequences

The CP1, CP2 and CP3 polynucleotides having the sequences as shown in
Figures 1A-1B, 5A-5B and 6A-6B, respectively, encode the Bacillus subtilis cysteine
proteases CP1, CP2 and CP3. As will be understood by the skilled artisan, due to the
degeneracy of the genetic code, a variety of polynucleotides can encode the Bacillus
subtilis CP1, CP2 and CP3. The present invention encompasses all such
polynucleotides.

The present invention encompasses CP1, CP2 and CP3 polynucleotide
homologs encoding gram-positive microorganism cysteine proteases CP1, CP2 and
CP3, respectively, which have at least 80%, or at least 90% or at least 95% identity to
B. subtilis CP1, CP2 and CP3, as long as the homolog encodes a protein that has
proteolytic activity.

Gram-positive polynucleotide homologs of B. subtilis CP1, CP2 or CP3 may be
obtained by standard procedures known in the art from, for example, cloned DNA (e.g., a
DNA "library"), genomic DNA libraries, by chemical synthesis once identified, by
cDNA cloning, or by the cloning of genomic DNA, or fragments thereof, purified from a
desired cell. (See, for example, Sambrook et al., 1989, Molecular Cloning, A Laboratory
Manual, 2d Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York;
Glover, D.M. (ed.), 1985, DNA Cloning: A Practical Approach, MRL Press, Ltd.,
Oxford, U.K. Vol. I, II.) A preferred source is from genomic DNA. Nucleic acid
sequences derived from genomic DNA may contain regulatory regions in addition to
coding regions. Whatever the source, the isolated CP1, CP2 or CP3 gene should be
molecularly cloned into a suitable vector for propagation of the gene.

In the molecular cloning of the gene from genomic DNA, DNA fragments are
generated, some of which will encode the desired gene. The DNA may be cleaved at
specific sites using various restriction enzymes. Alternatively, one may use DNAse in
the presence of manganese to fragment the DNA, or the DNA can be physically sheared,
as for example, by sonication. The linear DNA fragments can then be separated
according to size by standard techniques, including but not limited to, agarose and
polyacrylamide gel electrophoresis and column chromatography.
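
The cleavage and size-separation steps above can be mimicked in silico. The
following Python sketch cuts a linear sequence at every occurrence of a recognition
site and lists the fragments from longest to shortest, roughly as they would be
ordered on a gel. EcoRI (site GAATTC, cut after the G) is used purely as an example
enzyme; the text does not name one, and the function name and test sequence are
made up for this sketch.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear top-strand sequence at each site; return fragments by size.

    cut_offset is the number of bases into the site where the cut falls
    (1 for EcoRI, which cuts G^AATTC). Only one strand is modelled.
    """
    dna = dna.upper()
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    fragments = [dna[i:j] for i, j in zip(bounds, bounds[1:])]
    # Longest first, mirroring a size-ordered readout.
    return sorted((f for f in fragments if f), key=len, reverse=True)


# Made-up sequence containing two EcoRI sites.
fragments = digest("AAGAATTCTTGAATTCGG")
```

A sequence with no recognition site is returned as a single uncut fragment.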

Once the DNA fragments are generated, identification of the specific DNA
fragment containing the CP1, CP2 or CP3 may be accomplished in a number of ways.
For example, a B. subtilis CP1, CP2 or CP3 gene of the present invention or its
specific RNA, or a fragment thereof, such as a probe or primer, may be isolated and
labeled and then used in hybridization assays to detect a gram-positive CP1, CP2 or
CP3 gene. (Benton, W. and Davis, R., 1977, Science 196:180; Grunstein, M. and
Hogness, D., 1975, Proc. Natl. Acad. Sci. USA 72:3961). Those DNA fragments
sharing substantial sequence similarity to the probe will hybridize under stringent
conditions.

Accordingly, the present invention provides a method for the detection of
gram-positive CP1, CP2 and CP3 polynucleotide homologs which comprises
hybridizing part or all of a nucleic acid sequence of B. subtilis CP1, CP2 and CP3
with gram-positive microorganism nucleic acid of either genomic or cDNA origin.

Also included within the scope of the present invention are gram-positive
microorganism polynucleotide sequences that are capable of hybridizing to the
nucleotide sequence of B. subtilis CP1, CP2 or CP3 under conditions of intermediate
to maximal stringency. Hybridization conditions are based on the melting
temperature (Tm) of the nucleic acid binding complex, as taught in Berger and
Kimmel (1987, Guide to Molecular Cloning Techniques, Methods in Enzymology,
Vol 152, Academic Press, San Diego CA) incorporated herein by reference, and
confer a defined "stringency" as explained below.

"Maximum stringency" typically occurs at about Tm-5°C (5°C below the Tm
of the probe); "high stringency" at about 5°C to 10°C below Tm; "intermediate
stringency" at about 10°C to 20°C below Tm; and "low stringency" at about 20°C to
25°C below Tm. As will be understood by those of skill in the art, a maximum
stringency hybridization can be used to identify or detect identical polynucleotide
sequences while an intermediate or low stringency hybridization can be used to
identify or detect polynucleotide sequence homologs.

The term "hybridization" as used herein shall include "the process by which a
strand of nucleic acid joins with a complementary strand through base pairing"
(Coombs J (1994) Dictionary of Biotechnology, Stockton Press, New York NY).
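
The stringency bands defined above can be expressed as a simple lookup on the
difference between Tm and the hybridization temperature. The Python sketch below is
one possible reading: the text gives approximate, touching ranges, so assigning each
boundary value (5, 10, 20) to the more stringent band is an assumption of this
sketch, as are the function name and the "below low" label.

```python
def stringency_band(tm_c: float, hyb_temp_c: float) -> str:
    """Name the stringency band for a hybridization run at hyb_temp_c.

    Bands follow the text: about 5 C below Tm is maximum, 5-10 C high,
    10-20 C intermediate, 20-25 C low. Boundary handling is an assumption.
    """
    delta = tm_c - hyb_temp_c  # degrees C below the probe's Tm
    if delta < 0:
        raise ValueError("hybridization temperature is above Tm")
    if delta <= 5:
        return "maximum"
    if delta <= 10:
        return "high"
    if delta <= 20:
        return "intermediate"
    if delta <= 25:
        return "low"
    return "below low"


# Example with a hypothetical probe Tm of 70 C.
band = stringency_band(70, 62)
```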

The process of amplification as carried out in polymerase chain reaction
(PCR) technologies is described in Dieffenbach CW and GS Dveksler (1995, PCR
Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview NY). A nucleic
acid sequence of at least about 10 nucleotides and as many as about 60 nucleotides
from B. subtilis CP1, CP2 or CP3, preferably about 12 to 30 nucleotides, and more
preferably about 20-25 nucleotides, can be used as a probe or PCR primer.
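
As a rough illustration of the length ranges just given, the Python sketch below
enumerates every window of a chosen length from a sequence as a candidate probe or
primer, rejecting lengths outside the 10 to 60 nucleotide range. The GC-fraction
helper is an extra convenience not taken from the text, and all names and the test
sequence are hypothetical.

```python
def candidate_probes(
    seq: str, length: int = 20, min_len: int = 10, max_len: int = 60
) -> list[tuple[int, str]]:
    """Return (1-based start, oligo) for every window of the given length."""
    if not min_len <= length <= max_len:
        raise ValueError(f"length must be between {min_len} and {max_len}")
    seq = seq.upper()
    return [(i + 1, seq[i : i + length]) for i in range(len(seq) - length + 1)]


def gc_fraction(oligo: str) -> float:
    """Fraction of G and C bases; not from the text, shown as a common filter."""
    oligo = oligo.upper()
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


# Made-up 40-nucleotide template for illustration only.
template = "ACGT" * 10
probes = candidate_probes(template, length=20)
```

In practice candidates would be screened further, for example for uniqueness within
the target genome, which this sketch does not attempt.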

The B. subtilis amino acid sequences CP1, CP2 and CP3 (shown in Figures 2, 4
and 3, respectively) were identified via a FASTA search of Bacillus subtilis genomic
nucleic acid sequences. B. subtilis CP1 (YJDE) was identified by its structural
homology to the cysteine protease papain having the sequence designated
"papa_carpa.p". As shown in Figure 2, YJDE has the motif GXCWAF as well as the
conserved catalytic residues His/Ala and Asn/Ser. CP2 (YdhS) and CP3 (PMI) were
identified upon their structural homology to CP1 (YJDE). The presence of GXCWAF
as well as residues His/Ala and Asn/Ser is noted in Figures 3 and 4. CP3 (PMI) was
previously characterized as a possible phosphomannose isomerase (Noramata).
There has been no previous characterization of CP3 as a cysteine protease.

II. Expression Systems

The present invention provides host cells, expression methods and systems for
the enhanced production and secretion of desired heterologous or homologous
proteins in gram-positive microorganisms. In one embodiment, a host cell is
genetically engineered to have a deletion or mutation in the gene encoding a
gram-positive CP1, CP2 or CP3 such that the respective activity is deleted. In another
embodiment of the present invention, a gram-positive microorganism is genetically
engineered to produce a cysteine protease of the present invention.

Inactivation of a gram-positive cysteine protease in a host cell

Producing an expression host cell incapable of producing the naturally
occurring cysteine protease necessitates the replacement and/or inactivation of the
naturally occurring gene from the genome of the host cell. In a preferred
embodiment, the mutation is a non-reverting mutation.

One method for mutating nucleic acid encoding a gram-positive cysteine
protease is to clone the nucleic acid or part thereof, modify the nucleic acid by site
directed mutagenesis and reintroduce the mutated nucleic acid into the cell on a
plasmid. By homologous recombination, the mutated gene may be introduced into the
chromosome. In the parent host cell, the result is that the naturally occurring nucleic
acid and the mutated nucleic acid are located in tandem on the chromosome. After a
second recombination, the modified sequence is left in the chromosome, having
thereby effectively introduced the mutation into the chromosomal gene for progeny of
the parent host cell.

Another method for inactivating the cysteine protease proteolytic activity is
through deleting the chromosomal gene copy. In a preferred embodiment, the entire
gene is deleted, the deletion occurring in such a way as to make reversion impossible.
In another preferred embodiment, a partial deletion is produced, provided that the
nucleic acid sequence left in the chromosome is too short for homologous
recombination with a plasmid encoded cysteine protease gene. In another preferred
embodiment, nucleic acid encoding the catalytic amino acid residues is deleted.

Deletion of the naturally occurring gram-positive microorganism cysteine
protease can be carried out as follows. A cysteine protease gene including its 5' and
3' regions is isolated and inserted into a cloning vector. The coding region of the
cysteine protease gene is deleted from the vector in vitro, leaving behind a sufficient
amount of the 5' and 3' flanking sequences to provide for homologous recombination
with the naturally occurring gene in the parent host cell. The vector is then
transformed into the gram-positive host cell. The vector integrates into the
chromosome via homologous recombination in the flanking regions. This method
leads to a gram-positive strain in which the protease gene has been deleted.
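
The in vitro deletion step described above amounts to joining the 5' and 3' flanking
sequences with the coding region removed. The Python sketch below shows that
operation on a sequence string. The text does not state how much flanking sequence
is "sufficient", so the min_flank parameter is a hypothetical placeholder that the
caller must choose; the function name and the toy locus are also made up.

```python
def deletion_construct(
    locus: str, cds_start: int, cds_end: int, min_flank: int
) -> str:
    """Return 5' flank + 3' flank with the coding region removed.

    locus is the cloned region (5' flank, coding region, 3' flank);
    cds_start and cds_end are 0-based, end-exclusive coordinates of the
    coding region. min_flank is a caller-chosen placeholder threshold.
    """
    if not 0 <= cds_start < cds_end <= len(locus):
        raise ValueError("coding region coordinates fall outside the locus")
    five_prime = locus[:cds_start]
    three_prime = locus[cds_end:]
    if len(five_prime) < min_flank or len(three_prime) < min_flank:
        raise ValueError("flanking sequence shorter than min_flank")
    return five_prime + three_prime


# Toy locus: 4-base flanks around a 9-base "coding region" (illustration only).
toy_locus = "AAAA" + "ATGCCCTAA" + "TTTT"
construct = deletion_construct(toy_locus, 4, 13, min_flank=4)
```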

The vector used in an integration method is preferably a plasmid. A selectable
marker may be included to allow for ease of identification of desired recombinant
microorganisms. Additionally, as will be appreciated by one of skill in the art, the
vector is preferably one which can be selectively integrated into the chromosome.
This can be achieved by introducing an inducible origin of replication, for example, a
temperature sensitive origin, into the plasmid. By growing the transformants at a
temperature to which the origin of replication is sensitive, the replication function of
the plasmid is inactivated, thereby providing a means for selection of chromosomal
integrants. Integrants may be selected for growth at high temperatures in the presence
of the selectable marker, such as an antibiotic. Integration mechanisms are described
in WO 88/06623.

Integration by the Campbell-type mechanism can take place in the 5' flanking
region of the protease gene, resulting in a protease positive strain carrying the entire
plasmid vector in the chromosome in the cysteine protease locus. Since illegitimate
recombination will give different results, it will be necessary to determine whether the
complete gene has been deleted, such as through nucleic acid sequencing or restriction
maps.

Another method of inactivating the naturally occurring cysteine protease gene
is to mutagenize the chromosomal gene copy by transforming a gram-positive
microorganism with oligonucleotides which are mutagenic. Alternatively, the
chromosomal cysteine protease gene can be replaced with a mutant gene by
homologous recombination.

The present invention encompasses host cells having additional protease
deletions or mutations, such as deletions or mutations in apr, npr, epr, mpr and others
known to those of skill in the art.

One assay for the detection of mutants involves growing the Bacillus host cell
on medium containing a protease substrate and measuring the appearance, or lack
thereof, of a zone of clearing or halo around the colonies. Host cells which have an
inactive protease will exhibit little or no halo around the colonies.

III. Production of Cysteine Protease

For production of cysteine protease in a host cell, an expression vector
comprising at least one copy of nucleic acid encoding a gram-positive microorganism
CP1, CP2 or CP3, and preferably comprising multiple copies, is transformed into the
host cell under conditions suitable for expression of the cysteine protease. In
accordance with the present invention, polynucleotides which encode a gram-positive
microorganism CP1, CP2 or CP3, or fragments thereof, or fusion proteins or
polynucleotide homolog sequences that encode amino acid variants of B. subtilis CP1,
CP2 or CP3, may be used to generate recombinant DNA molecules that direct their
expression in host cells. In a preferred embodiment, the gram-positive host cell
belongs to the genus Bacillus. In another preferred embodiment, the gram-positive
host cell is B. subtilis.

As will be understood by those of skill in the art, it may be advantageous to
produce polynucleotide sequences possessing non-naturally occurring codons. Codons
preferred by a particular gram-positive host cell (Murray E et al (1989) Nuc Acids Res
17:477-508) can be selected, for example, to increase the rate of expression or to
produce recombinant RNA transcripts having desirable properties, such as a longer
half-life, than transcripts produced from naturally occurring sequence.
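
Selecting host-preferred codons can be pictured as replacing each codon with a
preferred synonym while leaving the encoded protein unchanged. The Python sketch
below does this under the standard genetic code. The PREFERRED table is a
hypothetical placeholder, not actual codon usage data for any Bacillus species, and
the function names and test sequence are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preference table for illustration; NOT real host usage data.
PREFERRED = {"M": "ATG", "K": "AAA", "L": "CTG", "*": "TAA"}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i : i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict[str, str]) -> str:
    """Swap each codon for the preferred synonym, keeping the protein fixed."""
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    protein = translate(cds.upper())
    codons = [cds.upper()[i : i + 3] for i in range(0, len(cds), 3)]
    return "".join(preferred.get(aa, c) for aa, c in zip(protein, codons))


# Made-up coding sequence: Met-Lys-Leu-stop.
original = "ATGAAGTTATAG"
recoded = recode(original, PREFERRED)
```

Residues without an entry in the preference table keep their original codon.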

Altered CP1, CP2 or CP3 polynucleotide sequences which may be used in
accordance with the invention include deletions, insertions or substitutions of different
nucleotide residues resulting in a polynucleotide that encodes the same or a
functionally equivalent CP1, CP2 or CP3 homolog, respectively. As used herein a
"deletion" is defined as a change in either nucleotide or amino acid sequence in which
one or more nucleotides or amino acid residues, respectively, are absent.

As used herein an "insertion" or "addition" is that change in a nucleotide or
amino acid sequence which has resulted in the addition of one or more nucleotides or
amino acid residues, respectively, as compared to the naturally occurring CP1, CP2 or
CP3.

As used herein "substitution" results from the replacement of one or more
nucleotides or amino acids by different nucleotides or amino acids, respectively.

The encoded protein may also show deletions, insertions or substitutions of
amino acid residues which produce a silent change and result in a functionally
equivalent CP1, CP2 or CP3 variant. Deliberate amino acid substitutions may be made
on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity,
and/or the amphipathic nature of the residues as long as the variant retains the ability
to modulate secretion. For example, negatively charged amino acids include aspartic
acid and glutamic acid; positively charged amino acids include lysine and arginine; and
amino acids with uncharged polar head groups having similar hydrophilicity values
include leucine, isoleucine, valine; glycine, alanine; asparagine, glutamine; serine,
threonine, phenylalanine, and tyrosine.
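
The residue groupings listed above can be turned into a simple check for whether a
proposed substitution stays within one group. The Python sketch below encodes the
groups as they are read from the paragraph; treating serine, threonine,
phenylalanine and tyrosine as a single group is an assumption about the intended
punctuation, and the function name is hypothetical.

```python
# Groups as read from the text (one-letter codes); the last grouping is an
# assumption about how the original list was punctuated.
SIMILAR_GROUPS = [
    set("DE"),    # negatively charged: aspartic acid, glutamic acid
    set("KR"),    # positively charged: lysine, arginine
    set("LIV"),   # leucine, isoleucine, valine
    set("GA"),    # glycine, alanine
    set("NQ"),    # asparagine, glutamine
    set("STFY"),  # serine, threonine, phenylalanine, tyrosine
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues fall in the same similarity group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in SIMILAR_GROUPS)


# Example: aspartic acid to glutamic acid.
same_group = is_conservative("D", "E")
```

Membership in a group says nothing about whether a particular variant retains
activity; that would still need to be tested.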

The CP1, CP2 or CP3 polynucleotides of the present invention may be
engineered in order to modify the cloning, processing and/or expression of the gene
product. For example, mutations may be introduced using techniques which are well
known in the art, e.g., site-directed mutagenesis to insert new restriction sites, to alter
glycosylation patterns or to change codon preference.

In one embodiment of the present invention, a gram-positive microorganism
CP1, CP2 or CP3 polynucleotide may be ligated to a heterologous sequence to encode
a fusion protein. A fusion protein may also be engineered to contain a cleavage site
located between the cysteine protease nucleotide sequence and the heterologous
protein sequence, so that the cysteine protease may be cleaved and purified away from
the heterologous moiety.



IV. Vector Sequences

Expression vectors used in expressing the cysteine proteases of the present
invention in gram-positive microorganisms comprise at least one promoter associated
with a cysteine protease selected from the group consisting of CP1, CP2 and CP3,
which promoter is functional in the host cell. In one embodiment of the present
invention, the promoter is the wild-type promoter for the selected cysteine protease
and in another embodiment of the present invention, the promoter is heterologous to
the cysteine protease, but still functional in the host cell. In one preferred
embodiment of the present invention, nucleic acid encoding the cysteine protease is
stably integrated into the microorganism genome.

In a preferred embodiment, the expression vector contains a multiple cloning
site cassette which preferably comprises at least one restriction endonuclease site
unique to the vector, to facilitate ease of nucleic acid manipulation. In a preferred
embodiment, the vector also comprises one or more selectable markers. As used
herein, the term selectable marker refers to a gene capable of expression in the
gram-positive host which allows for ease of selection of those hosts containing the vector.
Examples of such selectable markers include but are not limited to antibiotics, such
as erythromycin, actinomycin, chloramphenicol and tetracycline.

V. Transformation 

A variety of host cells can be used for the production of CP1,
CP2 and CP3 including bacterial, fungal, mammalian and insect cells. General
transformation procedures are taught in Current Protocols In Molecular Biology (vol.
1, edited by Ausubel et al., John Wiley & Sons, Inc. 1987, Chapter 9) and include
calcium phosphate methods, transformation using DEAE-Dextran and electroporation.
Plant transformation methods are taught in Rodriquez (WO 95/14099, published 26
May 1995).

In a preferred embodiment, the host cell is a gram-positive microorganism and
in another preferred embodiment, the host cell is Bacillus. In one embodiment of the
present invention, nucleic acid encoding one or more cysteine protease(s) of the
present invention is introduced into a host cell via an expression vector capable of
replicating within the Bacillus host cell. Suitable replicating plasmids for Bacillus are
described in Molecular Biological Methods for Bacillus, Ed. Harwood and Cutting,
John Wiley & Sons, 1990, hereby expressly incorporated by reference; see chapter 3
on plasmids. Suitable replicating plasmids for B. subtilis are listed on page 92.

In another embodiment, nucleic acid encoding a cysteine protease(s) of the
present invention is stably integrated into the microorganism genome. Preferred host
cells are gram-positive host cells. Another preferred host is Bacillus. Another
preferred host is Bacillus subtilis. Several strategies have been described in the
literature for the direct cloning of DNA in Bacillus. Plasmid marker rescue
transformation involves the uptake of a donor plasmid by competent cells carrying a
partially homologous resident plasmid (Contente et al., Plasmid 2:555-571 (1979);
Haima et al., Mol. Gen. Genet. 223:185-191 (1990); Weinrauch et al., J. Bacteriol.
154(3):1077-1087 (1983); and Weinrauch et al., J. Bacteriol. 169(3):1205-1211
(1987)). The incoming donor plasmid recombines with the homologous region of the
resident "helper" plasmid in a process that mimics chromosomal transformation.

Transformation by protoplast transformation is described for B. subtilis in
Chang and Cohen, (1979) Mol. Gen. Genet. 168:111-115; for B. megaterium in
Vorobjeva et al., (1980) FEMS Microbiol. Letters 7:261-263; for B.
amyloliquefaciens in Smith et al., (1986) Appl. and Env. Microbiol. 51:634; for
B. thuringiensis in Fisher et al., (1981) Arch. Microbiol. 139:213-217; for
B. sphaericus in McDonald (1984) J. Gen. Microbiol. 130:203; and for B. larvae in
Bakhiet et al., (1985) 49:577. Mann et al. (1986, Current Microbiol. 13:131-135)
report on transformation of Bacillus protoplasts and Holubova ((1985) Folia
Microbiol. 30:97) discloses methods for introducing DNA into protoplasts using
DNA-containing liposomes.

VI. Identification of Transformants

Whether a host cell has been transformed with a mutated or a naturally
occurring gene encoding a gram-positive CP1, CP2 or CP3, detection of the
presence/absence of marker gene expression can suggest whether the gene of interest
is present. However, its expression should be confirmed. For example, if the nucleic
acid encoding a cysteine protease is inserted within a marker gene sequence,
recombinant cells containing the insert can be identified by the absence of marker
gene function. Alternatively, a marker gene can be placed in tandem with nucleic acid
encoding the cysteine protease under the control of a single promoter. Expression of
the marker gene in response to induction or selection usually indicates expression of
the cysteine protease as well.

Alternatively, host cells which contain the coding sequence for a cysteine
protease and express the protein may be identified by a variety of procedures known
to those of skill in the art. These procedures include, but are not limited to,
DNA-DNA or DNA-RNA hybridization and protein bioassay or immunoassay techniques
which include membrane-based, solution-based, or chip-based technologies for the
detection and/or quantification of the nucleic acid or protein.

The presence of the cysteine protease polynucleotide sequence can be detected by
DNA-DNA or DNA-RNA hybridization or amplification using probes, portions or
fragments of B. subtilis CP1, CP2 or CP3.

VII. Assay of Protease Activity

There are various assays known to those of skill in the art for detecting and
measuring protease activity. There are assays based upon the release of acid-soluble
peptides from casein or hemoglobin measured as absorbance at 280 nm or
colorimetrically using the Folin method (Bergmeyer, et al., 1984, Methods of
Enzymatic Analysis vol. 5, Peptidases, Proteinases and their Inhibitors, Verlag
Chemie, Weinheim). Other assays involve the solubilization of chromogenic
substrates (Ward, 1983, Proteinases, in Microbial Enzymes and Biotechnology (W.M.
Fogarty, ed.), Applied Science, London, pp. 251-317).

VIII. Secretion of Recombinant Proteins

Means for determining the levels of secretion of a heterologous or homologous
protein in a gram-positive host cell and detecting secreted proteins include using
either polyclonal or monoclonal antibodies specific for the protein. Examples include
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and
fluorescent activated cell sorting (FACS). These and other assays are described,
among other places, in Hampton R et al (1990, Serological Methods, a Laboratory
Manual, APS Press, St Paul MN) and Maddox DE et al (1983, J Exp Med 158:1211).

A wide variety of labels and conjugation techniques are known by those
skilled in the art and can be used in various nucleic and amino acid assays. Means for
producing labeled hybridization or PCR probes for detecting specific polynucleotide
sequences include oligolabeling, nick translation, end-labeling or PCR amplification
using a labeled nucleotide. Alternatively, the nucleotide sequence, or any portion of
it, may be cloned into a vector for the production of an mRNA probe. Such vectors
are known in the art, are commercially available, and may be used to synthesize RNA
probes in vitro by addition of an appropriate RNA polymerase such as T7, T3 or SP6
and labeled nucleotides.

A number of companies such as Pharmacia Biotech (Piscataway NJ), Promega
(Madison WI), and US Biochemical Corp (Cleveland OH) supply commercial kits and
protocols for these procedures. Suitable reporter molecules or labels include those
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as
well as substrates, cofactors, inhibitors, magnetic particles and the like. Patents
teaching the use of such labels include US Patents 3,817,837; 3,850,752; 3,939,350;
3,996,345; 4,277,437; 4,275,149 and 4,366,241. Also, recombinant immunoglobulins
may be produced as shown in US Patent No. 4,816,567 and incorporated herein by
reference.

IX. Purification of Proteins

Gram-positive host cells transformed with polynucleotide sequences encoding
heterologous or homologous protein may be cultured under conditions suitable for the
expression and recovery of the encoded protein from cell culture. The protein
produced by a recombinant gram-positive host cell comprising a mutation or deletion
of the cysteine protease activity will be secreted into the culture media. Other
recombinant constructions may join the heterologous or homologous polynucleotide
sequences to nucleotide sequence encoding a polypeptide domain which will facilitate
purification of soluble proteins (Kroll DJ et al (1993) DNA Cell Biol 12:441-53).

Such purification facilitating domains include, but are not limited to, metal
chelating peptides such as histidine-tryptophan modules that allow purification on
immobilized metals (Porath J (1992) Protein Expr Purif 3:263-281), protein A
domains that allow purification on immobilized immunoglobulin, and the domain
utilized in the FLAGS extension/affinity purification system (Immunex Corp, Seattle
WA). The inclusion of a cleavable linker sequence such as Factor XA or enterokinase
(Invitrogen, San Diego CA) between the purification domain and the heterologous
protein can be used to facilitate purification.

X. Uses of The Present Invention

CP1, CP2 and CP3 and Genetically Engineered Host Cells
The present invention provides genetically engineered host cells comprising
preferably non-revertable mutations or deletions in the naturally occurring gene
encoding CP1, CP2 or CP3 such that the proteolytic activity is diminished or deleted
altogether. The host cell may contain additional protease deletions, such as deletions
of the mature subtilisin protease and/or mature neutral protease disclosed in United
States Patent No. 5,264,366.

In a preferred embodiment, the host cell is further genetically engineered to
produce a desired protein or polypeptide. In a preferred embodiment the host cell is a
Bacillus. In another preferred embodiment, the host cell is a Bacillus subtilis.

In an alternative embodiment, a host cell is genetically engineered to produce
a gram-positive CP1, CP2 or CP3. In a preferred embodiment, the host cell is grown
under large scale fermentation conditions, the CP1, CP2 or CP3 is isolated and/or
purified and used in cleaning compositions such as detergents. WO 95/10615
discloses detergent formulation.

CP1, CP2 and CP3 Polynucleotides

A B. subtilis polynucleotide, or any part thereof, provides the basis for detecting
the presence of gram-positive microorganism polynucleotide homologs through
hybridization techniques and PCR technology.

Accordingly, one aspect of the present invention is to provide for nucleic acid
hybridization and PCR probes which can be used to detect polynucleotide sequences,
including genomic and cDNA sequences, encoding gram-positive CP1, CP2 or CP3 or
portions thereof.

The manner and method of carrying out the present invention may be
more fully understood by those of skill in the art by reference to the following
examples, which examples are not intended in any manner to limit the scope of the
present invention or of the claims directed thereto.

Example I 

Preparation of a Genomic library 

The following example illustrates the preparation of a Bacillus genomic
library.

Genomic DNA from Bacillus cells is prepared as taught in Current Protocols
In Molecular Biology vol. 1, edited by Ausubel et al., John Wiley & Sons, Inc. 1987,
chapter 2.4.1. Generally, Bacillus cells from a saturated liquid culture are lysed and
the proteins removed by digestion with proteinase K. Cell wall debris,
polysaccharides, and remaining proteins are removed by selective precipitation with
CTAB, and high molecular weight genomic DNA is recovered from the resulting
supernatant by isopropanol precipitation. If exceptionally clean genomic DNA is
desired, an additional step of purifying the Bacillus genomic DNA on a cesium
chloride gradient is added.

After obtaining purified genomic DNA, the DNA is subjected to Sau3A
digestion. Sau3A recognizes the 4 base pair site GATC and generates fragments
compatible with several convenient phage lambda and cosmid vectors. The DNA is
subjected to partial digestion to increase the chance of obtaining random fragments.
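
The effect of complete versus partial Sau3A digestion can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, the toy sequence and the per-site cut probability are hypothetical, and the cut is placed immediately before the G of each GATC site, which is not stated in the text above.

```python
import random

SAU3A_SITE = "GATC"  # recognition sequence given in the text


def find_sites(seq, site=SAU3A_SITE):
    """Return the 0-based start position of every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, cut_positions):
    """Split seq at the given positions (assumed cut: just before the site)."""
    seq = seq.upper()
    fragments = []
    previous = 0
    for pos in sorted(cut_positions):
        if pos > previous:
            fragments.append(seq[previous:pos])
            previous = pos
    fragments.append(seq[previous:])
    return fragments


def partial_digest(seq, cut_probability, rng):
    """Cut each site independently with the given probability (a toy model)."""
    chosen = [p for p in find_sites(seq) if rng.random() < cut_probability]
    return digest(seq, chosen)


if __name__ == "__main__":
    toy = "AAGATCTTGATCGGCCGATCAA"  # hypothetical sequence, not from the figures
    print("complete:", digest(toy, find_sites(toy)))
    print("partial: ", partial_digest(toy, 0.5, random.Random(1)))
```

In this toy model a complete digest always yields the same fragment set, whereas a partial digest leaves some sites uncut, so repeated runs give overlapping fragments with different end points, which is the property a genomic library relies on.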

The partially digested Bacillus genomic DNA is subjected to size fractionation
on a 1% agarose gel prior to cloning into a vector. Alternatively, size fractionation on
a sucrose gradient can be used. The genomic DNA obtained from the size
fractionation step is purified away from the agarose and ligated into a cloning vector
appropriate for use in a host cell and transformed into the host cell.

Example II 

Detection of gram-positive microorganisms
The following example describes the detection of gram-positive
microorganism CP1. The same procedures can be used to detect CP2 and CP3.

DNA derived from a gram-positive microorganism is prepared according to
the methods disclosed in Current Protocols in Molecular Biology, Chap. 2 or 3. The
nucleic acid is subjected to hybridization and/or PCR amplification with a probe or
primer derived from CP1. A preferred probe comprises the nucleic acid section
containing the conserved motif GXCWAF.
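
A motif of this kind can be located in a protein sequence with a simple pattern search. The sketch below is illustrative only: it reads the motif as GXCWAF with X standing for any residue, the helper names are hypothetical, and the three sequences are the first 48 residues of the translations shown in the figures, labeled by the names used in the alignments (YJDE, YDHS, PMI).

```python
import re


def motif_to_regex(motif):
    """Compile a motif in which X means 'any residue' into a regex."""
    parts = ["." if c == "X" else re.escape(c) for c in motif.upper()]
    return re.compile("".join(parts))


def find_motif(protein, motif="GXCWAF"):
    """Return (1-based position, matched residues) for each occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


# N-terminal excerpts (48 residues each) copied from the figure translations.
EXCERPTS = {
    "YJDE": "MTTEPLFFKPVFKERIWGGTALADFGYTIPSQRTGECWAFAAHQNGQS",
    "YDHS": "MTHPLFLEPVFKERLWGGTKLRDAFGYAIPSQKTGECWAVSAHAHGSS",
    "PMI": "MTQSPIFLTPVFKEKIWGGTALRDRFGYSIPSESTGECWAISAHPKGP",
}

if __name__ == "__main__":
    for name, seq in EXCERPTS.items():
        print(name, "GXCWAF:", find_motif(seq), "GXCWA:", find_motif(seq, "GXCWA"))
```

On these excerpts the full six-residue pattern is found only in the YJDE sequence, while the shorter pattern GXCWA is found in all three. A probe would be designed from the nucleotides encoding the matched region; that step is not shown here.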

The nucleic acid probe is labeled by combining 50 pmol of the nucleic acid
and 250 mCi of [gamma-32P] adenosine triphosphate (Amersham, Chicago IL) and
T4 polynucleotide kinase (DuPont NEN®, Boston MA). The labeled probe is purified
with Sephadex G-25 super fine resin column (Pharmacia). A portion containing 10^7
counts per minute of each is used in a typical membrane based hybridization analysis
of nucleic acid sample of either genomic or cDNA origin.

The DNA sample which has been subjected to restriction endonuclease
digestion is fractionated on a 0.7 percent agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is
carried out for 16 hours at 40 degrees C. To remove nonspecific signals, blots are
sequentially washed at room temperature under increasingly stringent conditions up to
0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate. The blots are exposed
to film for several hours, the film developed and hybridization patterns are compared
visually to detect polynucleotide homologs of B. subtilis CP1. The homologs are
subjected to confirmatory nucleic acid sequencing. Methods for nucleic acid
sequencing are well known in the art. Conventional enzymatic methods employ DNA
polymerase Klenow fragment, SEQUENASE® (US Biochemical Corp, Cleveland,
OH) or Taq polymerase to extend DNA chains from an oligonucleotide primer
annealed to the DNA template of interest.

Various other examples and modifications of the foregoing description and
examples will be apparent to a person skilled in the art after reading the disclosure
without departing from the spirit and scope of the invention, and it is intended that all
such examples or modifications be included within the scope of the appended claims.
All publications and patents referenced herein are hereby incorporated by reference in
their entirety.

CLAIMS

We claim: 

1. An isolated polynucleotide encoding CP1 from a gram-positive microorganism.

2. The polynucleotide of Claim 1 wherein CP1 has the amino acid sequence shown in
Figures 1A-1B.

3. An isolated CP1 encoding nucleic acid having the nucleic acid sequence as shown
in Figure 1.

4. An isolated CP1 from a gram-positive microorganism.

5. The isolated CP1 of Claim 4 having the amino acid sequence as shown in Figures
1A-1B.

6. An isolated polynucleotide encoding CP2 from a gram-positive microorganism.

7. The polynucleotide of Claim 6 wherein CP2 has the amino acid sequence shown in
Figures 5A-5B.

8. The isolated CP2 encoding nucleic acid having the sequence as shown in Figures
5A-5B.

9. An isolated CP2 from a gram-positive microorganism.

10. The isolated CP2 of Claim 9 having the amino acid sequence as shown in Figures
5A-5B.

11. A gram-positive microorganism having a mutation or deletion of part or all of the
gene encoding CP1, said mutation or deletion resulting in the inactivation of the
CP1 proteolytic activity.

12. A gram-positive microorganism having a mutation or deletion of part or all of the
gene encoding CP2, said mutation or deletion resulting in the inactivation of the CP2
proteolytic activity.

13. A gram-positive microorganism having a mutation or deletion of part or all of the
gene encoding CP3, said mutation or deletion resulting in the inactivation of the CP3
proteolytic activity.

14. The gram-positive microorganism according to Claims 11, 12 or 13 that is a
member of the family Bacillus.

15. The microorganism according to Claim 14 wherein the member is selected from
the group consisting of B. licheniformis, B. lentus, B. brevis, B. stearothermophilus,
B. alkalophilus, B. amyloliquefaciens, B. coagulans, B. circulans, B. lautus and
Bacillus thuringiensis.

16. The microorganism of Claim 11, 12 or 13 wherein said microorganism is capable 
of expressing a heterologous protein. 

17. The host cell of Claim 16 wherein said heterologous protein is selected from the
group consisting of hormone, enzyme, growth factor and cytokine.

18. The host cell of Claim 17 wherein said heterologous protein is an enzyme.

19. The host cell of Claim 15 wherein said enzyme is selected from the group
consisting of proteases, carbohydrases, and lipases; isomerases such as racemases,
epimerases, tautomerases, or mutases; transferases, kinases and phosphatases.

20. A cleaning composition comprising a cysteine protease selected from the group
consisting of CP1, CP2 and CP3.

21. An expression vector comprising nucleic acid encoding a cysteine protease
selected from the group consisting of CP1, CP2 and CP3.

22. A host cell comprising an expression vector according to Claim 21.

ABSTRACT

The present invention relates to the identification of novel cysteine proteases
in Gram-positive microorganisms. The present invention provides the nucleic acid and
amino acid sequences for the Bacillus subtilis cysteine proteases CP1, CP2 and CP3.
The present invention also provides host cells having a mutation or deletion of part or
all of the gene encoding CP1, CP2 or CP3. The present invention also provides host
cells further comprising nucleic acid encoding desired heterologous proteins such as
enzymes. The present invention also provides a cleaning composition comprising a
cysteine protease of the present invention.

10 30
atgacgactgaaccgttatttttcaagcctgttttcaaagaaagaatt 
MTTEPLFFKPVFKERI

50 70 90

tggggcgggaccgctttagctgattttggctataccattccgtcacaa 
WGGTALADFGYTIPSQ

110 130 
cgaacaggggagtgctgggcttttgccgcgcatcaaaatggtcaaagc 
RTGECWAFAAHQNGQS 

150 170 190 

gttgttcaaaacggaatgtataaggggttcacgctcagcgaattatgg 
VVQNGMYKGFTLSELW 

210 230 
gaacatcacagacatttattcggacagcttgaaggggaccgtttccct 
EHHRHLFGQLEGDRFP 

250 270 2 

ctgcttacaaaaatattagatgctgaccaggacttatctgttcaggtg 
LLTKILDADQDLSVQV 

90 310 330 

catccgaatgatgaatatgccaacatacatgaaaacggtgagcttgga 
HPNDEYANIHENGELG



350 370 
aaaacagaatgctggtacattattgattgccaaaaagatgccgagatt 
KTECWYIIDCQKDAEI 

390 410 430 

atttatggccacaatgcaacaacaaaggaagaactaactaccatgata
IYGHNATTKEELTTMI

450 470 
gagcgtggagaatgggatgagctcttgcgccgtgtaaaggtaaagccg
ERGEWDELLRRVKVKP

490 510 5 

ggggattttttctatgtgccaagcggtactgttcatgcgattggaaaa 
GDFFYVPSGTVHAIGK 

30 550 570 

ggaattcttgctttggagacgcagcagaactcagacacaacctacaga
GILALETQQNSDTTYR

590 610 
ttatatgattatgaccgaaaagacgcagaaggcaagctgcgcgagctt 
LYDYDRKDAEGKLREL 

630 650 670 

catctgaaaaagagcattgaagtgatagaggtcccgtctattccagaa 
HLKKSIEVIEVPSIPE 

690 710 




cggcatacagttcaccatgaacaaattgaggat ttgcttacaacga^^ 
RHTVHHEQIEDLLTTT 

730 750 7 

ttgattgaatgcgcttacttttcggtggggaaatggaacttatcagga 
LIECAYFSVGKWNLSG 

70 790 810 

tcagcaagcttaaagcagcaaaaaccattccttcttatcagtgtgatt 

SASLKQQKPFLLISVI 

830 850 
gaaggggagggccgtatgatctctggtgagtatgtctatcctttcaaa 
EGEGRMISGEYVYPFK 

870 890 910 

aaaggagatcatatgttgctgccttacggtcttggagaatttaaactc
KGDHMLLPYGLGEFKL

930 

gaaggatatgcagaatgtatcgtctcccatctg 
EGYAECIVSHL

SCORES Init1: 48 Initn: 48 Opt: 79 z-score: 94.7 E(): 7.2

Smith-Waterman score: 79; 21.9% identity in 155 aa overlap

130 140 150 j ^60 jr ' 170 180 

papa__carpa.p VLNDGDVNIPEYVDWRQKGAVTPVKNQGSCpSGWAFSAWTIEGIIKIRTGNLNEYSE 

I U I I • I • ' ' • • : • I • : • 

/. I 

YJDE PLFFKPVFKERIWGGTALADFGYTIPSQRTGEGWAFAAHQNGQSWQ— NGMYKGFTL 

10 20 30 1 40 50 60 

190 200 210 220 230 240 

papa_carpa . p LLDCDRRSYGCNGG--YPWSALQLVAQYGIHYRNTYPYEGVQRYCRSREKGPYAAKTD 
GV 

I : 1 : : ! ! : ! : ! I : : : : : ! : : I : ! : 1 : I I : 

YJDE LWEHHRKLFGQLEGDRFPLLTKILDADQDLSVQ-VHPND EYANIHENGELG-KTE 

CW 

70 80 90 100 110 

250 260 270 280 290 

papa_carpa.p RQVQPYNEGALLV S lAKQPVSWLEAJ^GKDFQLYR GGIFVGPCGNKVDHA 

VA 



YJDE YIIDCQKDAEi: VGr:::A':TKEELTTMIERGEWDELLRRVKVKPGDFFYVPSGT 

VH 

12C i:-: 140 150 160 170 

30C 320 330 340 

papa_carpa.p AVGVGPKYI L: r:::5W"7GWGENGYIRIKRGTGNSYGVCGLYTSSFYPVKN 

1 M i 

YJDE AIG'r'.GILALH:?: . N'S: 77VRLYDYDRKDAEGKLRELHLKKS lEVIEVPS I PERHTVHH 

EQ 

ISC : 200 210 220 230 

SCORES Init1: ^41 Initn: 1252 Opt: 1256 z-score: 1441.1 E(): 0

Smith-Waterman score: 1256; 56.6% identity in 316 aa overlap

10 20 30 ^ 40 50 

yjde . pep MTTEPLFFKPVFKERIWGGTALAD-FGYTIPSQRTpECWAFjAAHQNGQSVVQNGMyKG 
FT ' ' 

II I : I : H I M : M I I i II I I I I : I I I ; I II I i I : : II : I I : I II Ml 

I 

P^3I MTQSPIFLTPVFKEKIWGGTALRDRFGYSIPSESTGECWAISAHPKGPSTVANGPYKG 
KT ^ 

10 20 30 40 50 

60 

60 70 80 90 ' 100 110 - 1 

19 

yjde. pep LSELWEHHRHLFGQLEGDRFPLLTKILDADQDLSVQVHPNDEYANIHENGELGKTECW 
YI 



II : I II: : I : i M II i II 



PMI LIELWEEHREVFGGVEGDRFPLLTKLLDVKEDTSIKVHPDDYYAGENEEGELGKTECW 

YI 

70 80 90 100 110 1 

20 

120 130 140 150 160 170 1 

79 J. 14, 

yjde. pep idcqkdaeiiyghnattkeelttmiergewdellrrvkvkpgdffyvpsgtVhaigkg 

XL ' 

i M : : : i M M II : i : i I i : I 1 1 : I : I : it II : 1 : I II II : II I! M : M : !l 

1 

PMI idcken^'^iiyghtarsktelvtminsgdwegllrrikikpgdfyyvpsgtlhalckg 



al 

80 



39 



t ft 

130 140 150 160 170 



180 190 200 210 220 230 



yjde, pep ALETQQNSDTTVRLYr^VDKKDASGKLRSLHLKKSIEVIEVPSIPERHTVHHEQIEDLL 
TT 

: ! 1 M M ! ! : I M : 1 : M i : : ! : 1! N : I : : : : M : i : : : 

PMI VLETQQNSDATVr.Vy-VDRLDSNGSPRELKPAKAVNAATVPHVDGYIDESTESRKGIT 

190 200 210 220 230 2 



IK 
40 



240 250 26C 270 280 290 2 

99 

yjde. pep TLIECAVrSVX-KWK'LS2:SASv:-;0QK?ELLISVIEGEGRMXSGEYVYPFKKGDHMLL?Y 
GL 

I : : ■: Mil ' : : : 1 : 1 : : 11 : I M I I I : : : I : II M I : : M 
PMI TrvOGEVrSVVr:>;2: NC^'^Air-tAODESr licsviegsgllkyedktcplkkgdhfilpa 

250 26': 270 280 290 3 

00 

300 31C 
yjde . pep G£?KLZZYr.ZZ:':Z:\:. 

: I : : i ^ • : I : ^ 

PMI pdftikgtctlivsh: 3g/ fi^iAr-tL ^ 

310 



SCORES Initl 1128 Initn: 1128 Opt: 1 

0 



19 

yjde.pep 

YI . 

I I 

YDHS 
YI 



79 

yjde . pep 
IL 



YDHS 
TL 



yj de . pep 



YDHS 
1 1 



y3ae . pep 
GL 



YDHS 



y] ae . pep 
YDHS 



2-score: 1418-2 E() : 
55.6% identity in 313 aa overlap 
20 30 40 50 

mtteplffkpvfkeriwggtalad-fgytipsqr*i|gecwa^aahqngqsvvqngmykg 



Smith-Waterman score: 1236; 

10 

59 

yjde . pep 
FT 

I 

YDHS 
KT 



I : I I I : : ! I I I I I : 1 



I I I I : I I I I : 1 I i 1 



MTHPLFLEPVFKERLWGGTKLRDAFGYAIPSQKTGECWAVSAHAHGSSSVKNGPLAG 

10 20 30 40 50 

60 70 80 90 100. 110 1 

LSELWEHHRHLFGQLEGDRFPLLTKILDADQDLSVQVHPNDEYANIHENGELGKTECW 

I : : : I : I : : 1 I : ! II 1 I : I : I t I : : I I I M M I : 1 : I I : : I I M : I I M I I I 
LDQVWKDHPEIFGFPDGKVFPLLVKLLDANMDLSVQVHPDDDYAKLHENGDLGKTECW 

60 70 80 90 100 110 



120 



130 



140 



150 



160 



170 



120 
180 



IDCQKDAEIIYGHNATTKEELTTMIERGEWDELLRRVKVKPGDFFYVPSGTVHAIGKG 
lit: I 1! : I M : I : 11 II : II 1 : I : I M I : I : M 1 I I II 11 I II : 1! : M 

IDCKDDAELILGHHASTKEEFKQRIESGDWNGLLRRIKIKPGDFFYVPSGTLHALCKG 

tt 



130 
190 



140 
200 



150 
210 



160 
220 



170 
230 



ALETQQNSDTTYRLYDYDRKDAEGKLRELHLKKSIEVIEVPSIPERHTVHHEQIEDLL 

: M i i i i il I II : i i II I : : I : I II : : I : : II ! : I t : 11 : : : : : 
VLEIQQNSDTTYRVYDYDRCNDQGQKRTLHIEKAMEVITIPHIDPCVHTPEVKEVGNAE 

ft 

180 190 200 210 220 230 

240 250 260 270 280 290 2 

TLIECAYrSVGKWNLSGSASLrCQQKPFLLISVIEGEGRMISGEYVYPFKKGDHMLLPY 

: : : M I! 11 : : I I I : : : : ! ! 1 I : I M : 1 : : I : I : I : : II 
VYVQSDYFSVYKWKISGRA^^^.FPSYQTYLLGSVLSGSGRIINNGIQYECNAGSHFILPA 
240 250 260 270 280 290 

300 310 

GEFKLEGYAECIVSHL 

M 1 : M I : : } ! 

GEFTIEGTCEFMISH? 
300 310 



($2 3^1 



10 30 
atgacgcatccattatttttagagcctgtctttaaagaaagactatgg 
MTHPLFLEPVFKERLW



50 70 90 

ggagggacgaagcttcgtgacgcttttggctacgcaataccctcacaa 

GGTKLRDAFGYAIPSQ 

110 130 
aaaacaggtgagtgctgggccgtttctgcacatgcccatggctcgtcg 
KTGECWAVSAHAHGSS 

150 170 190 

tctgtaaaaaatggcccgctggcaggaaagacacttgatcaagtatgg 
SVKNGPLAGKTLDQVW

210 230 
aaagatcatccagagatattcgggtttccggatggtaaggtgtttccg 
KDHPEIFGFPDGKVFP 

250 270 2 

ctgctggtaaagctgctggacgccaatatggatctctccgtgcaagtc 
LLVKLLDANMDLSVQV 

90 - 310 330 

catcctgatgatgattatgcaaaactgcacgaaaatggcgaccttggt 

HPDDDYAKLHENGDLG 

350 370 
aaaacggagtgctggtatatcattgattgcaaagatgacgccgaacta 
KTECWYIIDCKDDAEL

390 410 430 

attttgggacatcatgcaagcacaaaggaagagttcaaacaacgaata
ILGHHASTKEEFKQRI 

450 470 
gaaagcggtgattggaacgggctgctgaggcgaatcaaaatcaagcca 
ESGDWNGLLRRIKIKP 

490 510 5 

ggagatttcttttacgtgccaagcggtacactccatgctttatgtaag 
GDFFYVPSGTLHALCK 

30 550 570 

ggaacccttgtcctcgaaatccagcaaaactccgatacaacatatcgc
GTLVLEIQQNSDTTYR

590 610 
gtatacgatitacgaccgc tgtaargaccagggccaaaaaagaactctt 
VYDYDRCNDQGQKRTL 

630 650 670 

catatagaaaaagccatggaagtcataacgataccgcatatcgataaa 
HIEKAMEVITIPHIDK

690 710 

351 f-i^u'^c SA- 



73o" " ' 750 " ' " ' 7 

tatgtgcaatcagattatttctcagtgtacaaatggaagattagcggc 

YVQSDYFSVYKWKISG

70 790 810 

cgagctgcttttccttcatatcaaacctatttgctggggagtgttctg 

RAAFPSYQTYLLGSVL

agcggatcaggacgaatcataaataatggtattcagtatgaatgcaat 
SGSGRIINNGIQYECN

870 890 
gcaggctcacactttattctgcctgcgcattttggagaatttacaata 

AGSHFILPAHFGEFTI 
930 

gaaggaacatgtgaattcatgatatctcatcct 
EGTCEFMISHP 



10 30 
atgacgcaatcaccgatttttctaacgcctgtgtteaaagaaaaaatc 
MTQSPIFLTPVFKEKI 

50 70 90 

tggggcggaaccgctttacgagatagatttggatacagtattccttca 

WGGTALRDRFGYSIPS 

110 130 
gaatcaacgggggaatgctgggccatttccgctcatccaaaaggaccg 
ESTGECWAISAHPKGP 

150 170 190 

agcactgttgcaaatggcccgtataaaggaaagacattgatcgagctt 
STVANGPYKGKTLIEL

210 230 
tgggaagagcaccgtgaagtattcggcggcgtagagggggatcggttt 
WEEHREVFGGVEGDRF 

250 270 2 

ccgcttctgacaaagctgctggatgtgaaggaagatacgtcaattaaa 
PLLTKLLDVKEDTSIK 

90 310 330 

gttcaccctgatgattactatgccggagaaaacgaagagggagaactc 

VHPDDYYAGENEEGEL 

350 370 
ggcaagacggaatgctggtacattatcgactgtaaggaaaacgcagaa 
GKTECWYIIDCKENAE

390 410 430 

atcatttacgggcatacggcccgctcaaaaaccgaacttgtcacaatg
IIYGHTARSKTELVTM

450 470 
atcaacagcggtgactgggagggcctgctgcgaagaatcaaaattaaa 
INSGDWEGLLRRIKIK 

490 510 5 

ccgggtgatttctattatgtgccgagcggaacgctgcacgcattgtgc
PGDFYYVPSGTLHALC 

30 550 570 

aagggggcccttgttttagagactcagcaaaattcagatgccacatac
KGALVLETQQNSDATY

590 610 
cgggtgtacgattatgaccgccttgatagcaacggaagtccgagagag 
RVYDYDRLDSNGSPRE 

630 650 670 

cttcattttgccaaagcggtcaatgccgccacggttccccatgtggac
LHFAKAVNAATVPHVD 

690 710 



gggtatatagatgaatcgacagaatcaagaaaaggaataaccattaaa 
GYIDESTESRKGITIK 

730 ' - - 750 7 

acatttgtccaaggggaatatttttcggtttataaatgggacatcaat 
TFVQGEYFSVYKWDIN 

70 790 810 

ggcgaagctgaaatggctcaggatgaatcctttctgatttgcagcgtg 

GEAEMAQDESFLICSV 

830- 850 
atagaaggaagcggtttgctcaagtatgaggacaaaacatgtccgctc 
IEGSGLLKYEDKTCPL

870 890 910 

aaaaaaggtgatcactttattttgccggctcaaatgcccgattttacg 
KKGDHFILPAQMPDFT 

930 

ataaaaggaacttgtacccttatcgtgtctcatatt
IKGTCTLIVSHI 




